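Today we'll use Helm to install Prometheus Operator. The original prometheus-operator chart has been deprecated and replaced by prometheus-community/kube-prometheus-stack. kube-prometheus is a set of configuration and deployment manifests built on Prometheus plus Prometheus Operator and integrated with Kubernetes.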
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
kubectl create ns monitoring
helm install homelab-monitoring prometheus-community/kube-prometheus-stack --version 18.0.8 -n monitoring
There are far too many settings to cover, so let's work backwards from what a default install produces.
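For a rough idea of what overriding those defaults could look like, here is a minimal values file sketch. The file name values.yaml and the example numbers are my own assumptions; the keys are ones I believe the kube-prometheus-stack chart exposes, but it is worth checking them against `helm show values` for the pinned chart version before relying on them.

```yaml
# values.yaml (hypothetical file name); pass it with `-f values.yaml` on helm install/upgrade.
prometheus:
  prometheusSpec:
    # Keep metrics longer than the chart default; 30d is an arbitrary example value.
    retention: 30d
    # When false, the chart should leave these selectors empty instead of
    # requiring the Helm release label on ServiceMonitor / PodMonitor objects.
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
```

Nothing in the rest of this post depends on this file; the install above uses the chart defaults only.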
In short, the installed components are Prometheus, Prometheus Operator, Alertmanager, and Grafana.
Prometheus Operator also brings its CRDs, along with a fair number of custom resources.
Take the Prometheus custom resource as an example:
kubectl get prometheuses.monitoring.coreos.com homelab-monitoring-kube-pr-prometheus -o yaml -n monitoring
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    meta.helm.sh/release-name: homelab-monitoring
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2021-09-15T13:39:16Z"
  generation: 1
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: homelab-monitoring
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 18.0.8
    chart: kube-prometheus-stack-18.0.8
    heritage: Helm
    release: homelab-monitoring
  name: homelab-monitoring-kube-pr-prometheus
  namespace: monitoring
  resourceVersion: "304609"
  uid: 8ebae130-9d60-462c-bdd7-be25853c1754
spec:
  alerting:
    alertmanagers:
    - apiVersion: v2
      name: homelab-monitoring-kube-pr-alertmanager
      namespace: monitoring
      pathPrefix: /
      port: web
  enableAdminAPI: false
  externalUrl: http://homelab-monitoring-kube-pr-prometheus.monitoring:9090
  image: quay.io/prometheus/prometheus:v2.28.1
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: homelab-monitoring
  portName: web
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: homelab-monitoring
  replicas: 1
  retention: 10d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      app: kube-prometheus-stack
      release: homelab-monitoring
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: homelab-monitoring-kube-pr-prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: homelab-monitoring
  shards: 1
  version: v2.28.1
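From this manifest we can read off a few settings, for example that data retention defaults to 10 days. Because we did not set any parameters at install time, PodMonitor, ServiceMonitor, PrometheusRule, and Probe custom resources all need the label release: homelab-monitoring under matchLabels to be picked up (rules additionally need app: kube-prometheus-stack), while the namespace selectors are empty, so the namespace they live in is not restricted.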
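To make the label requirement concrete, here is a minimal sketch of a ServiceMonitor that the default selector above should pick up. The application name, Service label, and port name are hypothetical; only the release: homelab-monitoring label comes from the manifest shown above.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app              # hypothetical name
  namespace: default             # any namespace works, since serviceMonitorNamespaceSelector is {}
  labels:
    release: homelab-monitoring  # required by serviceMonitorSelector.matchLabels
spec:
  selector:
    matchLabels:
      app: example-app           # hypothetical label on the target Service
  endpoints:
  - port: metrics                # hypothetical named port on the Service
    path: /metrics
    interval: 30s
```

Without the release label, the object would still be created, but Prometheus would not scrape it.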
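Next, look at the metrics Services that the chart creates by default.
The corresponding ServiceMonitors have been created as well.
Port-forward Prometheus to the local machine to take a look: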
kubectl port-forward svc/homelab-monitoring-kube-pr-prometheus 9090:9090 -n monitoring
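The kube-proxy and kube-controller-manager targets show as down. For kube-proxy, the cause is that its metrics endpoint only listens on 127.0.0.1, so it cannot be reached via the node IP; after changing that, it works. For kube-controller-manager, the cause is that plain HTTP access is disabled in its configuration, leaving only authenticated HTTPS. For now I found a way to open both up, then checked the targets again.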
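The kube-proxy change mentioned above can be sketched as the following configuration fragment. This assumes kube-proxy is configured through a KubeProxyConfiguration (usually stored in the kube-proxy ConfigMap in kube-system); how that file is managed depends on the installer, so treat the location as an assumption.

```yaml
# Fragment of the kube-proxy configuration; other fields are omitted.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# The default is 127.0.0.1:10249, which is unreachable from Prometheus via the node IP.
# 0.0.0.0 exposes the metrics endpoint on all interfaces.
metricsBindAddress: 0.0.0.0:10249
```

kube-proxy reads this at startup, so its pods need to be restarted for the change to take effect. Binding to all interfaces also widens exposure, which may or may not be acceptable outside a homelab.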
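I looked through the 1.20 documentation carefully: --port=0 no longer appears in the kube-controller-manager reference, and the documented port has changed from 10252 to 10257. However, the static pod deployed by kubespray still carries --port=0. I simply commented it out and found the flag still has an effect: port 10252 is listening again. My impression is that this Helm chart will need further adjustment in later versions (once the flag is fully removed), and that the metrics Services reached through ServiceMonitors are ultimately expected to be accessed over HTTPS with an authorized ServiceAccount, for better security.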
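As a sketch of that HTTPS direction, the chart values below would point the kube-controller-manager scrape at the secure port instead of re-enabling 10252. I have not applied this in the lab above; the keys are ones I believe exist under kubeControllerManager in this chart version, and the TLS setting is an assumption suited to a homelab.

```yaml
# Hypothetical values override for kube-prometheus-stack.
kubeControllerManager:
  service:
    port: 10257          # secure port of kube-controller-manager
    targetPort: 10257
  serviceMonitor:
    https: true
    # The serving certificate is often self-signed, so verification is skipped here.
    # That is an assumption for a homelab, not a recommendation.
    insecureSkipVerify: true
```

This also assumes the secure port is bound to an address the Prometheus pods can reach, and that the Prometheus ServiceAccount is authorized to read /metrics.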
Grafana comes with a large number of dashboards preloaded.
From here you can do whatever observability work you need in Grafana.
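Next, look at the Alertmanager custom resource under alertmanagers.monitoring.coreos.com, shown below. With alertmanagerConfigNamespaceSelector: {} and alertmanagerConfigSelector: {}, the default is to select from all namespaces, with no restriction on which configs are matched.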
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: Alertmanager
  metadata:
    annotations:
      meta.helm.sh/release-name: homelab-monitoring
      meta.helm.sh/release-namespace: monitoring
    creationTimestamp: "2021-09-15T13:39:16Z"
    generation: 1
    labels:
      app: kube-prometheus-stack-alertmanager
      app.kubernetes.io/instance: homelab-monitoring
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/part-of: kube-prometheus-stack
      app.kubernetes.io/version: 18.0.8
      chart: kube-prometheus-stack-18.0.8
      heritage: Helm
      release: homelab-monitoring
    name: homelab-monitoring-kube-pr-alertmanager
    namespace: monitoring
    resourceVersion: "304594"
    uid: 5f05b7a3-ed7f-43bf-b1b1-be6e907aa187
  spec:
    alertmanagerConfigNamespaceSelector: {}
    alertmanagerConfigSelector: {}
    externalUrl: http://homelab-monitoring-kube-pr-alertmanager.monitoring:9093
    image: quay.io/prometheus/alertmanager:v0.22.2
    listenLocal: false
    logFormat: logfmt
    logLevel: info
    paused: false
    portName: web
    replicas: 1
    retention: 120h
    routePrefix: /
    securityContext:
      fsGroup: 2000
      runAsGroup: 2000
      runAsNonRoot: true
      runAsUser: 1000
    serviceAccountName: homelab-monitoring-kube-pr-alertmanager
    version: v0.22.2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
AlertmanagerConfig is configured in the same way as a regular Alertmanager; see the linked reference. I won't go any further into the configuration here.
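For orientation only, a minimal AlertmanagerConfig might look like the sketch below. The object name, receiver name, and webhook URL are hypothetical, and I have not applied this in the lab above.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: example-routing          # hypothetical name
  namespace: default             # any namespace, since alertmanagerConfigNamespaceSelector is {}
spec:
  route:
    receiver: example-webhook
    groupBy: ["alertname"]
    groupWait: 30s
    groupInterval: 5m
    repeatInterval: 12h
  receivers:
  - name: example-webhook
    webhookConfigs:
    - url: http://example-receiver.default.svc:8080/alerts   # hypothetical endpoint
```

The structure mirrors a plain Alertmanager configuration file, except that field names are camelCase (groupBy, webhookConfigs) rather than snake_case.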
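While doing today's lab I suddenly realized that the prometheus operator chart I used to use had been deprecated without my noticing. Clearly it has been far too long since I last installed it QQ. Also, yesterday's loki-stack shipped Grafana 7.x and today's kube-prometheus ships Grafana 8.x, and the different-looking login pages were quite a shock XD.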